You may see the following model fit measures, depending on your software (a Python sketch after this list shows one way to compute them):
» A p value associated with the decrease in deviance between the null
model and the final model: This information is shown in Figure 18-4a under
Deviance. Under α = 0.05, if this p value < 0.05, it indicates that adding the
predictor variables to the null model statistically significantly improves its
ability to predict the outcome. In Figure 18-4a, p < 0.0001, which means that
adding radiation dose to the model makes it statistically significantly better at
predicting an individual animal’s chance of dying than the null model.
However, it’s not very hard for a model with any predictors at all to beat the null model, so this is not a very stringent measure of model fit.
» A p value from the Hosmer-Lemeshow (H-L) test: In Figure 18-4a, this is
listed under Hosmer-Lemeshow Goodness of Fit Test. The null hypothesis for this test is that your data are consistent with the logistic function’s S shape, so if p < 0.05, the logistic model doesn’t fit your data well and your data don’t qualify for logistic regression. The test focuses on whether the S is getting distorted at very high or very low levels of the predictor (as shown in Figure 18-4b). In Figure 18-4a, the H-L p value is 0.842, which means that the data are consistent with the shape of a logistic curve.
» One or more pseudo–r2 values: Pseudo–r2 values indicate how much of the
total variability in the outcome is explainable by the fitted model. They are
analogous to how r2 is interpreted in ordinary least-squares regression, as
described in Chapter 17. In Figure 18-4a, two such values are provided under
the labels Cox/Snell R-square and Nagelkerke R-square. The Cox/Snell r2 is 0.577,
and the Nagelkerke r2 is 0.770, both of which indicate that a majority of the
variability in the outcome is explainable by the logistic model.
» Akaike’s Information Criterion (AIC): AIC is a measure of the final model
deviance adjusted for how many predictor variables are in the model. Like
deviance, the smaller the AIC, the better the fit. The AIC is not very useful on
its own, and is instead used for choosing between different models. When all
the predictors in one model are nested — or included — in another model
with more predictors, the AIC is helpful for comparing these models to see if it
is worth adding the extra predictors.
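
If your software happens to be Python rather than a point-and-click package, here is a minimal sketch of how these fit measures might be pulled out of a logistic regression with the statsmodels and scipy libraries. The data and variable names (dose, died) are made-up stand-ins, not the book’s radiation example, and the hand-rolled Hosmer-Lemeshow calculation is just one common way to do it, since statsmodels doesn’t report that test directly.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import chi2

# Made-up dose-lethality data: radiation dose and whether each animal died (1) or survived (0)
rng = np.random.default_rng(1)
dose = rng.uniform(1, 10, size=200)
died = rng.binomial(1, 1 / (1 + np.exp(-(dose - 5))))  # assumed underlying S curve

X = sm.add_constant(pd.DataFrame({"dose": dose}))
fit = sm.Logit(died, X).fit(disp=0)
n = len(died)

# p value for the decrease in deviance between the null model and the fitted model
print("Likelihood-ratio test p value:", fit.llr_pvalue)

# Cox/Snell and Nagelkerke pseudo-r2, computed from the log-likelihoods
cox_snell = 1 - np.exp(-fit.llr / n)                      # fit.llr = 2 * (llf - llnull)
nagelkerke = cox_snell / (1 - np.exp(2 * fit.llnull / n))
print("Cox/Snell r2:", cox_snell, "Nagelkerke r2:", nagelkerke)

# AIC: model deviance penalized for the number of parameters (smaller is better)
print("AIC:", fit.aic)

# Hosmer-Lemeshow test by hand: split animals into ten groups by predicted risk,
# then compare observed and expected numbers of deaths in each group
pred = np.asarray(fit.predict(X))
groups = pd.qcut(pred, 10, labels=False, duplicates="drop")
obs = pd.Series(died).groupby(groups).sum()       # observed deaths per group
exp = pd.Series(pred).groupby(groups).sum()       # expected deaths per group
tot = pd.Series(pred).groupby(groups).count()     # animals per group
hl_stat = (((obs - exp) ** 2) / (exp * (1 - exp / tot))).sum()
print("Hosmer-Lemeshow p value:", chi2.sf(hl_stat, obs.size - 2))
```

Comparing AICs across nested models works the same way: fit the model with and without the extra predictors and keep the version with the smaller fit.aic.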
Checking out the table of regression
coefficients
Your intention when developing a logistic regression model is to obtain estimates
from the table of coefficients, which looks much like the coefficients table from
ordinary straight-line or multivariate least-squares regression (see Chapters 16
and 17). In Figure 18-4a, they are listed under Coefficients and Standard Errors. Observe:
» Every predictor variable appears on a separate row.